Addressing data sparsity for neural machine translation between morphologically rich languages
Similar resources
Word Representation Models for Morphologically Rich Languages in Neural Machine Translation
Dealing with the complex word forms in morphologically rich languages is an open problem in language processing, and is particularly important in translation. In contrast to most modern neural translation systems, which discard the identity of rare words, in this paper we propose several architectures for learning word representations from character and morpheme level word decompositions. ...
Improving the Performance of Neural Machine Translation Involving Morphologically Rich Languages
The advent of the attention mechanism in neural machine translation models has improved the performance of machine translation systems by enabling selective lookup into the source sentence. In this paper, the efficiencies of translation using bidirectional encoder attention decoder models were studied with respect to translation involving morphologically rich languages. The English–Tamil langua...
Using POS Information for Statistical Machine Translation into Morphologically Rich Languages
When translating from languages with hardly any inflectional morphology like English into morphologically rich languages, the English word forms often do not contain enough information for producing the correct full form in the target language. We investigate methods for improving the quality of such translations by making use of part-of-speech information and maximum entropy modeling. Results fo...
Identifying main obstacles for statistical machine translation of morphologically rich South Slavic languages
The best way to improve a statistical machine translation system is to identify concrete problems causing translation errors and address them. Many of these problems are related to the characteristics of the involved languages and differences between them. This work explores the main obstacles for statistical machine translation systems involving two morphologically rich and under-resourced lan...
A Hybrid Morpheme-Word Representation for Machine Translation of Morphologically Rich Languages
We propose a language-independent approach for improving statistical machine translation for morphologically rich languages using a hybrid morpheme-word representation where the basic unit of translation is the morpheme, but word boundaries are respected at all stages of the translation process. Our model extends the classic phrase-based model by means of (1) word boundary-aware morpheme-level ...
Journal
Journal title: Machine Translation
Year: 2020
ISSN: 0922-6567,1573-0573
DOI: 10.1007/s10590-019-09242-9